Corpus: pus_newscrawl_2011_30K


2.2.5 Most frequent word beginnings

The most frequent word beginnings, given as character N-grams for N = 1 to 5, together with a Zipf diagram
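The counting behind tables of this kind can be sketched as follows. This is a minimal illustration, not the corpus tool's actual procedure: the function name `top_word_beginnings`, the sample word list, and the choices to count every occurrence in the list and to skip words shorter than N characters are all assumptions. Note that Python counts code points, so a zero-width non-joiner inside a word counts as one character.

```python
from collections import Counter


def top_word_beginnings(words, n, k=5):
    """Return the k most frequent n-character word beginnings.

    Assumption: a word contributes only if it has at least n characters.
    """
    counts = Counter(word[:n] for word in words if len(word) >= n)
    return counts.most_common(k)


# Hypothetical sample input; a real run would read the corpus word list.
sample_words = ["afghan", "afghanistan", "after", "before", "be"]

for n in range(1, 6):
    print(n, top_word_beginnings(sample_words, n, k=2))
```

Each result is a list of (n-gram, frequency) pairs ordered by descending frequency, which corresponds to the rank, frequency and n-gram columns in the tables.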


[Figure: Zipf's diagram for word beginnings (Gnuplot diagram)]
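A Zipf diagram plots frequency against rank on logarithmic axes. The sketch below shows one way to derive such points and a least-squares slope from a list of frequencies. The helper names `zipf_points` and `loglog_slope` are hypothetical, and the five input values are only the top rows of the "Top Characters" table, so the resulting slope says little about the full distribution.

```python
import math


def zipf_points(frequencies):
    """Return (log10 rank, log10 frequency) pairs, rank 1 = most frequent."""
    ordered = sorted(frequencies, reverse=True)
    return [(math.log10(rank), math.log10(freq))
            for rank, freq in enumerate(ordered, start=1)]


def loglog_slope(points):
    """Least-squares slope of a line through the log-log points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Frequencies of the five most frequent single-character word beginnings.
top_characters = [6834, 6188, 5884, 5250, 3848]
print(loglog_slope(zipf_points(top_characters)))
```

A slope near -1 would match the classic Zipf relation, in which frequency is inversely proportional to rank. All frequencies must be positive for the logarithm to be defined.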

Top Characters
rank frequency n-gram
1 6834 ا-
2 6188 م-
3 5884 د-
4 5250 و-
5 3848 ب-
Top Character Bigrams
rank frequency n-gram
1 1664 او-
2 1233 را-
3 972 دا-
4 908 ور-
5 821 در-
Top Character Trigrams
rank frequency n-gram
1 306 راو-
2 305 می‌-
3 252 کار-
4 247 است-
5 210 اود-
Top Character 4-Grams
rank frequency n-gram
1 108 عبدا-
2 74 محمد-
3 63 افغا-
4 56 نمی‌-
5 54 استا-
Top Character 5-Grams
rank frequency n-gram
1 103 عبدال-
2 60 افغان-
3 46 درلود-
4 36 څرګند-
5 32 برابر-
858 msec needed at 2018-03-21 10:55